Greg Detre
Monday, April 21, 2003
learning
from imitation
efficient motor learning
connection between action and perception
modular motor control in the form of motor primitives
development
of humanoid robots
Honda
humanoid robot
"As it is impossible to search such huge spaces for what constitutes a
good action, it is necessary to either find more compact state-action representations,
or to focus learning on those parts of the state-action space that are actually
relevant for the movement task at hand. In the following article, we will
review how the latter topic can be approached in the framework of imitation
learning, while the former topic, i.e., compact state-action representations,
will be shown to be a natural prerequisite for imitation learning in the form
of movement primitives"
teleoperation???
how might
you represent state-action pairs more compactly???
how do movement primitives help??? is it because they're aggregations of
state-action pairs???
you need a
goal to imitate the right aspects of an action, and you really need language to
communicate goals...
how do you
combine imitative with reinforcement learning???
task
strategy vs task goal???
I suppose the strategy is your (relatively high-level) means of going
about things, and your goal is the end-point or abstract description of the
task that you're trying to achieve...
it will never be the "same" goal - always a
mapped goal (i.e. swapping "him" for "me" etc.)
that's a hard problem in itself
rough
definitions: accommodation is reuse, assimilation is recognition
are these the Piagetian definitions???
are the rest of them standard/agreed definitions???
are "control policy" and "movement primitive" synonymous??? is a CP a
series of MPs???
"In infant
and animal studies, the ability to imitate is usually concluded from the
subject's increased tendency to execute a previously demonstrated behavior.
However, other causes can equally account for a higher probability of the
subject's behavior, in particular priming, emulation, and response facilitation
(Glossary); such causes are not to be mistaken with true imitation (8, 9). True
imitation is present only if i) the imitated behavior is new for the imitator,
ii) the same task strategy as that of the demonstrator is employed, and iii)
the same task goal is accomplished"
why does
imitation add more than supervised learning to the learning of communication???
(pg 5)
how does
the F5 homologue to Broca's area thing work??? pg 7
important
distinction made between program- and action-level imitation
"Action Level Imitation: The indiscriminate copying of the actions of
the teacher without mapping them onto more abstract motor representation.
Program Level Imitation: A process by which the structural organization
of a behavior is copied from observing a teacher, while the exact details of
actions are filled in by individual learning."
but I don't
understand task-level learning:
"Task Level Learning: Learning of a task can take place by learning an
appropriate Control Policy that generates commands u on the actuator level, or
by learning a Control Policy that generates commands in a more abstract but
task related space, e.g., the space of the finger tip. The latter approach is
called task-level learning and it requires additional transformations to map
the task-level command into actuator space. Usually, errors in performance are
more associated with task commands than actuator commands."
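A minimal sketch of what the "additional transformations" might look like, assuming a two-link planar arm (the arm, link lengths, gain and all function names here are illustrative assumptions, not from the paper). The policy only ever talks about the fingertip; a separate step maps its command into joint velocities via the inverse Jacobian.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths (assumed for illustration)

def fingertip(q):
    """Forward kinematics: joint angles (q1, q2) -> fingertip (x, y)."""
    q1, q2 = q
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))

def jacobian(q):
    """2x2 Jacobian of fingertip position with respect to the joint angles."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))

def task_policy(tip, goal, gain=1.0):
    """Task-level control policy: a fingertip velocity command toward the goal."""
    return (gain * (goal[0] - tip[0]), gain * (goal[1] - tip[1]))

def to_actuator_space(q, v):
    """Map a fingertip velocity into joint velocities (inverse Jacobian)."""
    (a, b), (c, d) = jacobian(q)
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("arm is at a singular configuration")
    return ((d * v[0] - b * v[1]) / det,
            (-c * v[0] + a * v[1]) / det)

def reach(q, goal, dt=0.01, steps=2000):
    """Euler-integrate the joint angles while following the task-level policy."""
    for _ in range(steps):
        v = task_policy(fingertip(q), goal)   # command in fingertip space
        dq = to_actuator_space(q, v)          # same command in joint space
        q = (q[0] + dt * dq[0], q[1] + dt * dq[1])
    return q

q_final = reach((0.3, 1.2), goal=(1.0, 1.0))
tip_final = fingertip(q_final)
```

The error the policy corrects is measured in fingertip coordinates, which may be what the quote means by errors being "more associated with task commands".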
important
difficult definitions:
"Control Policy: A function that maps the state x of a movement system
and its environment into an appropriate action u for a particular task, i.e.,
u = π(x, t, α). As indicated, the function π can directly depend on the
time, t, and some additional parameters α that may be useful to adjust the
policy for a particular task goal. Movement Primitives can be formalized in the
form of control policies."
"Movement Primitive: A sequence of actions that can accomplish a certain
movement goal. See Control Policy for a more formal definition."
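To make the two definitions concrete for myself, here is a rough sketch of a control policy u = π(x, t, α) used as a movement primitive. The point-attractor form, the one-dimensional state, the plant x_dot = u and the parameter names `goal` and `gain` are all assumptions chosen for illustration; the paper only gives the general signature.

```python
def policy(x, t, alpha):
    """u = pi(x, t, alpha): a point attractor toward alpha['goal'].

    This particular policy ignores t, so it is autonomous; the general
    definition allows an explicit dependence on time.
    """
    return alpha["gain"] * (alpha["goal"] - x)

def run_primitive(x0, alpha, dt=0.01, steps=1000):
    """Roll the policy out on an assumed plant x_dot = u (Euler integration)."""
    x, t = x0, 0.0
    trajectory = [x]
    for _ in range(steps):
        u = policy(x, t, alpha)
        x += dt * u
        t += dt
        trajectory.append(x)
    return trajectory

# The same primitive, re-aimed at two different goals by changing only alpha.
reach_up = run_primitive(0.0, {"goal": 1.0, "gain": 5.0})
reach_down = run_primitive(0.0, {"goal": -2.0, "gain": 5.0})
```

On this reading, the "sequence of actions" in the movement-primitive definition is just the rollout of one policy, and α is the small set of parameters that gets adjusted per goal.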
movement primitives (pg 9)
sequences of action that accomplish a complete goal-directed behaviour
can be as simple as an elementary action in the symbolic approaches to
imitation, e.g. go forward
do not scale well with many DoFs
interesting
result:
connection from somatosensory cortex to the superior temporal sulcus
(STs) in macaques - most of the form and motion neurons were insensitive to
self-motion due to re-afferent signals
what else
is required for full imitation than is provided by the mirror neurons??? pg 7
"some neurons in F5, called 'mirror neurons', were active both when the
monkey observed a specific behavior and when it executed it itself (37). Mirror
neurons fire highly specifically only to a special motor behavior with a
particular object. These results are similar to those in STs (28), with the
difference that neurons in STs do not respond to executed motor acts, but
rather only to perceived ones."
what's the
human analogue of STs (macaques)???
what's the likely human analogue of the STs-7b-F5 macaque imitation pathway???
is there any reason to believe there's just one??? more than one???
I suppose we're only talking about visually mediated motor movements...
they've specifically ruled out verbally mediated...
might the pathway be slightly different for imitating non-humans???
after all, since the mirror neurons are specific to humans, and certain body
parts and certain actions, it's quite possible that there will be one pathway
for imitating some (recognised) actions (or whose goal is known) and other
pathways for purely action-level imitation, say, right???
early
symbolic approaches to imitation learning
state-action-state sequence was converted into if-then rules
this sort of FSM (???) is doomed because it will suffer from
combinatorial explosion
difficulty synthesising new movements??? difficulty seeing analogous
movements???
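A back-of-envelope illustration of the combinatorial explosion, under the assumption (mine, not the paper's) that each degree of freedom is discretised into a fixed number of levels and that a rule table needs up to one if-then rule per discrete state:

```python
def num_states(levels_per_dof, num_dofs):
    """Number of discrete states when every DoF is split into the same number of levels."""
    return levels_per_dof ** num_dofs

# Assumed discretisation: 10 levels per degree of freedom.
for dofs in (2, 7, 30):
    print(dofs, "DoFs ->", num_states(10, dofs), "states")
```

Even at a coarse 10 levels per joint, the table grows by a factor of 10 with every added degree of freedom, which is presumably why this style of representation does not scale to a humanoid.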
MPs - code
complete temporal behaviours, like "grasping a cup", "walking", "a tennis
serve" - compact state-action representation where only a few parameters
need to be adjusted for a specific goal
so the hard problem underlying an imitation system is building up a set
of useful movement primitives
presumably you also need a system for generating new MPs, and noticing
when to use them and when not to
"the perceived action of the teacher is mapped onto a set of existing
primitives in an assimilation phase"
I like the idea that once you've decided what you need to do, and stored
a high-level description of the action you're trying to achieve, then you can
use supervised learning to improve your actual motor performance on this task
(this is also mentioned somewhere above)
should there be a continuum between task strategy and task goal -
presumably the task strategy will be expressed at a higher and higher level as
you get older, perhaps even in terms of the task goals you worked towards when
you were younger
or are task goals some different category of representations???
do mirror
neurons notice motor primitives then???
are there
some motor primitives at a higher level???
in fact, is there something lower-level than a motor primitive???
surely
what would you call it then???
can you
have imitation without any reinforcement signal???
surely
is there
something qualitatively different about imitating something (right) after just
one viewing???
presumably
the imitative abilities (and their representations) and motor control develop
simultaneously, right???
model-based
learning??? predictive feedforward model???
presumably you use supervised learning to build a supervised model of
what motor commands led to what change of state, then when you want to produce
some specific state, you can feed back to what motor command to use???
can you feed the information backwards like this through a feedforward
net???
I think so
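A toy version of what I have in mind, as a sketch only: a one-parameter linear forward model stands in for the feedforward net, the plant and all names are made up for illustration, and "feeding back" is done by gradient descent on the input (the motor command) while the learned model stays fixed.

```python
def fit_forward_model(samples):
    """Least-squares fit of b in: state_change = b * command.

    samples is a list of (command, observed_state_change) pairs.
    """
    num = sum(u * d for u, d in samples)
    den = sum(u * u for u, _ in samples)
    return num / den

def predict(state, command, b):
    """Forward model: predicted next state for a given command."""
    return state + b * command

def invert(state, target, b, lr=0.1, steps=200):
    """Search for the command whose predicted outcome matches the target.

    The model parameter b is held fixed; only the input u is adjusted,
    following the gradient of the squared prediction error.
    """
    u = 0.0
    for _ in range(steps):
        err = predict(state, u, b) - target
        u -= lr * 2.0 * err * b  # d(err**2)/du = 2 * err * b
    return u

# Supervised phase: observe what some commands did (assumed true plant: change = 2 * u).
samples = [(u, 2.0 * u) for u in (-1.0, -0.5, 0.5, 1.0)]
b = fit_forward_model(samples)

# Inversion phase: which command takes state 1.0 to state 4.0?
command = invert(state=1.0, target=4.0, b=b)
```

With a real multi-layer net the same idea would mean backpropagating the error to the input units rather than to the weights.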
"Interesting
insights into these methods [for learning novel behaviours by imitation] can be
gained by analyzing the process of how a perceived behavior is mapped onto a
set of existing primitives. Two major questions become a) what is the matching
criterion for recognizing a behavior, and b) in which coordinate frame does
matching take place?"
via-point
method???
presumably this is setting up waypoints in the teacher's action that you
try and reproduce???
yes, and they can be used for classification (and presumably evaluation of
success) too
various issues with translation, scale and rotation invariance
"the suggested bidirectional interaction between perception and action
is noteworthy"???
bidirectional interaction of generative and recognition models in
unsupervised learning
movement recognition is based on the movement generation system
"movement recognition based on forward models integrates smoothly with
the simulation theory of mind"
see Meltzoff and Moore's "Active Intermodal Mapping"
I like the
idea of forward models put into multiple-model competition - but you do need
some means of deciding between them
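One possible means of deciding, sketched under my own assumptions (this is a generic soft competition on prediction error, not necessarily what the paper proposes): each forward model predicts the next state, and each gets a responsibility proportional to a Gaussian likelihood of its prediction error. The two models, the noise scale `sigma` and the function names are hypothetical.

```python
import math

def responsibilities(predictions, observed, sigma=0.1):
    """Normalised weights for competing forward models.

    Each model's score is a Gaussian likelihood of its prediction error.
    The largest log-score is subtracted before exponentiating so the
    normalisation never divides by zero.
    """
    logits = [-((p - observed) ** 2) / (2.0 * sigma ** 2) for p in predictions]
    top = max(logits)
    scores = [math.exp(l - top) for l in logits]
    total = sum(scores)
    return [s / total for s in scores]

# Two hypothetical forward models of the same movement system.
forward_models = [
    lambda x, u: x + 1.0 * u,   # e.g. moving a light object
    lambda x, u: x + 0.5 * u,   # e.g. moving a heavy object
]

x, u = 0.0, 1.0
observed_next = 0.5  # what actually happened after issuing u
preds = [m(x, u) for m in forward_models]
resp = responsibilities(preds, observed_next)
winner = resp.index(max(resp))
```

The same weights could serve for recognition: whichever model best predicts the observed movement names the behaviour being watched.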
"imitation
learning could be conceived of as a research strategy that channels investigations
in computational motor control towards the important topic of action-perception
coupling"
is
"tweening" the right term for an arbitrary trajectory???
do the
feedforward models, via-point method and splining all do roughly the same
thing???
I reckon the feedforward models (especially if they're in multiple
competition) are more general (and heterogeneous) in the types of actions they
can represent
is the splining used to interpolate between the via-points???
I think a task-specific control policy chooses between the motor primitives...???
hmmm, maybe
or maybe they're synonymous
I didn't understand the need for or the distinction between movement planning and execution??? (pg 6)
endeffector???
splines as nonautonomous, "since the viapoints defining the splines are parameterised explicitly in time"???
not robust in coping with unforeseen perturbations of the movement
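A small sketch of how I read the "nonautonomous" point. It uses piecewise-linear interpolation between via-points in place of a real spline, purely to keep it short, and the via-point values, gain and function names are invented for illustration. The time-indexed plan returns the same target whatever the arm is actually doing, whereas a state-dependent policy reacts to where the arm has ended up.

```python
def via_point_plan(t, via_points):
    """Desired position at time t, interpolated between (time, position) via-points.

    The output depends on t only: the actual state of the system never enters.
    """
    if t <= via_points[0][0]:
        return via_points[0][1]
    for (t0, p0), (t1, p1) in zip(via_points, via_points[1:]):
        if t <= t1:
            return p0 + (p1 - p0) * (t - t0) / (t1 - t0)
    return via_points[-1][1]

def attractor_policy(x, goal, gain=2.0):
    """Autonomous alternative: the command depends on the current state x."""
    return gain * (goal - x)

via_points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]

# The plan at t = 0.5 is identical whether or not the arm was knocked off course.
planned = via_point_plan(0.5, via_points)

# The autonomous policy gives different commands for an undisturbed and a perturbed state.
u_undisturbed = attractor_policy(0.2, goal=1.0)
u_perturbed = attractor_policy(-0.5, goal=1.0)
```

So after a perturbation the time-indexed plan carries on as if nothing happened, which I take to be the lack of robustness being described.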